The goal of causal mediation analysis is, broadly, to decompose the total effect (TE) of a treatment on an outcome into the direct effect (DE) of the treatment and the indirect effects (IDEs) of the treatment that operate through one or more mediators. That is, if we have \(p\) mediators, with \(\text{IDE}_j\) the indirect effect through mediator \(j\) and TIDE the total indirect effect: \[ \text{TE} = \text{DE} + \text{IDE}_1 +\ldots+ \text{IDE}_p = \text{DE} + \text{TIDE} \]
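As a rough illustration of this decomposition only, the sketch below uses the classical linear product-of-coefficients approach on simulated data. This is not the proposed model: the data, the coefficient values, and the helper `ols` are hypothetical, and no confounders are included.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 2

# Simulated data (hypothetical): binary treatment, p mediators, one outcome.
t = rng.integers(0, 2, size=n).astype(float)
m = t[:, None] * np.array([0.8, -0.5]) + rng.normal(size=(n, p))
y = 0.3 * t + m @ np.array([0.6, 0.4]) + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

# Treatment -> mediator slopes, one regression per mediator.
alpha = np.array([ols(t[:, None], m[:, j])[1] for j in range(p)])

# Outcome model: direct effect of treatment and mediator -> outcome slopes.
coef = ols(np.column_stack([t, m]), y)
de, beta = coef[1], coef[2:]

ide = alpha * beta           # IDE_1, ..., IDE_p (product of coefficients)
tide = ide.sum()             # total indirect effect
te = ols(t[:, None], y)[1]   # total effect from regressing Y on T alone

print(f"TE={te:.3f}  DE={de:.3f}  TIDE={tide:.3f}")
```

In this purely linear setting, fitted by least squares, the identity \(\text{TE} = \text{DE} + \text{TIDE}\) holds exactly for the estimates. That need not be true of other estimation approaches.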

Historically, such models have been fairly simple: one treatment, one mediator, one outcome, some set of confounders. There have been a few papers that allow for multiple outcomes: Schaid et al. (2022) and Zhao et al. (2023). However, both of these methods assume that the treatment, mediators, and outcomes are jointly normal. This precludes the use of a binary treatment.

The proposed model does not make any joint distributional assumptions and allows for binary treatments.

In most of the models below, the mediators are the taxa in the respective microbiome, aggregated to the genus level and transformed via the centered log-ratio (CLR) transformation.

There are several important limitations to note, which are being addressed in current and future improvements:

Methodology and data summary

The general form of the considered mediation models is shown in Figure 1. The set of mediators consists of some subset of the saliva microbiome, the plaque microbiome, and oral health outcomes (PI, ICDAS, DS, MS, FS, DT, MT, and FT). The set of outcomes consists of some subset of oral health outcomes (surfaces only) and birth outcomes (adverse birth outcome and birth weight).

General mediation model


The microbiome data were first aggregated to the genus level, then converted to relative abundances within each individual, and finally transformed with the centered log-ratio (CLR) transformation to better approximate normality.
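The preprocessing described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the analysis: the function name `clr_from_counts`, the example counts, and in particular the pseudocount used to handle zero counts are assumptions, since the handling of zeros is not stated here.

```python
import numpy as np

def clr_from_counts(counts, pseudocount=0.5):
    """Relative abundance followed by the centered log-ratio (CLR) transform.

    `counts` is an (individuals x genera) array of genus-level counts.
    The pseudocount is an assumed way to avoid log(0).
    """
    counts = np.asarray(counts, dtype=float) + pseudocount
    # Relative abundance within each individual (each row sums to 1).
    rel = counts / counts.sum(axis=1, keepdims=True)
    # CLR: log abundance minus the row mean of the log abundances.
    log_rel = np.log(rel)
    return log_rel - log_rel.mean(axis=1, keepdims=True)

# Hypothetical genus-level counts for two individuals and four genera.
genus_counts = np.array([[120, 30, 0, 50],
                         [10, 200, 5, 85]])
clr = clr_from_counts(genus_counts)
print(clr.round(3))
```

By construction, the CLR values for each individual sum to zero, so the transformed genera are linearly dependent within a sample.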

Plaque index (PI) and the International Caries Detection and Assessment System (ICDAS) score are presented on their original scales because their distributions were roughly symmetric. Decayed, missing, and filled surfaces (DS, MS, and FS, respectively) are true count data, so each was square-root transformed and then standardized to make its distribution closer to continuous and approximately normal. For example, for individual \(i\): \[ \text{MS}_i'=\frac{\sqrt{\text{MS}_i}-n^{-1}\sum_{j=1}^n \sqrt{\text{MS}_j}}{\text{sd}(\sqrt{\text{MS}})} \]

Interpretations therefore need to be made in terms of standard-deviation changes in the square root of the considered variable, relative to its mean on the square-root scale. Adverse birth outcome is a binary variable and was thus left as-is, albeit in violation of our normality assumption. Birth weight was sufficiently symmetric and was therefore left on its original scale.
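The square-root standardization of the count variables can be sketched as below. The function name and the example values are hypothetical, and using the sample standard deviation (denominator \(n-1\)) for \(\text{sd}(\cdot)\) is an assumption.

```python
import numpy as np

def sqrt_standardize(x):
    """Square-root transform a count variable, then center and scale it."""
    root = np.sqrt(np.asarray(x, dtype=float))
    # ddof=1 gives the sample standard deviation (an assumed convention).
    return (root - root.mean()) / root.std(ddof=1)

# Hypothetical missing-surface counts for five individuals.
ms = np.array([0, 1, 4, 9, 16])
ms_std = sqrt_standardize(ms)
print(ms_std.round(3))
```

A one-unit change in the transformed variable then corresponds to one standard deviation on the square-root scale, which is the scale on which the effects for DS, MS, and FS are interpreted.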

The set of considered confounders comprises: age, race, Hispanic (Y/N), yeast infection (Y/N), on antibiotics (Y/N), on antifungal (Y/N), brushing twice daily (Y/N), prenatal inhaler use, prenatal diabetes status, prenatal asthma status, prenatal emotional condition, prenatal hypertension status, prenatal smoking status, employment status, modified education level (middle school/high school, Associate’s degree, or Bachelor’s degree), marriage status, number of children, cortisol level (log), estradiol level (log), progesterone level (log), testosterone level (log), T3 level (log), T4 level (log), and gestational age at first visit.

Included below are tables corresponding to each of the nine considered models. Each table contains the selected partial indirect effects (PIDEs), total indirect effect (TIDE), and direct effect (DE) of the exposure (presence of C. albicans). Entries in the “Estimand” column that are not explicitly labeled with an effect are the partial indirect effects from the exposure to the corresponding outcome through that mediator. Additionally, each table contains the mean and standard deviation of 5000 bootstrap samples for each of the above effects, as well as a bootstrap \(p\)-value (where the null hypothesis is an effect of 0) and two forms of bootstrap confidence intervals.
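A summary of this kind could be computed from a vector of bootstrap draws roughly as follows. The exact \(p\)-value definition and the two interval types used in the tables are not specified here, so the choices below (a two-sided tail-proportion \(p\)-value, a percentile interval, and a normal-approximation interval) are assumptions, as are the function name and the simulated draws.

```python
import numpy as np
from statistics import NormalDist

def summarize_bootstrap(draws, estimate, level=0.95):
    """Summarize bootstrap draws of one effect (assumed definitions)."""
    draws = np.asarray(draws, dtype=float)
    a = 1.0 - level
    mean, sd = draws.mean(), draws.std(ddof=1)
    # Two-sided p-value for H0: effect = 0, taken as twice the smaller
    # proportion of draws on either side of zero, capped at 1.
    p_value = min(1.0, 2 * min((draws <= 0).mean(), (draws >= 0).mean()))
    # Percentile interval: empirical quantiles of the draws.
    lo, hi = np.quantile(draws, [a / 2, 1 - a / 2])
    # Normal-approximation interval: estimate +/- z * bootstrap SD.
    z = NormalDist().inv_cdf(1 - a / 2)
    return {
        "mean": float(mean),
        "sd": float(sd),
        "p_value": float(p_value),
        "percentile_ci": (float(lo), float(hi)),
        "normal_ci": (float(estimate - z * sd), float(estimate + z * sd)),
    }

# Simulated placeholder draws standing in for 5000 bootstrap replicates.
rng = np.random.default_rng(1)
draws = rng.normal(loc=0.5, scale=0.1, size=5000)
summary = summarize_bootstrap(draws, estimate=0.5)
print(summary)
```

The two intervals generally agree when the bootstrap distribution is close to symmetric and can differ noticeably when it is skewed.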

Accompanying each table is a visualization of the (partially) selected directed acyclic graph (DAG). Note that, for several of the considered models, these DAGs do not contain all selected mediators. Rather, to keep the graphs interpretable, they include only those mediators that have at least one statistically significant effect. Double arrows from a mediator to an outcome represent mediation paths (i.e., \(\text{exposure}\to\text{mediator}\to\text{outcome}\)) that are statistically significant at the \(\alpha=0.05\) level based on the bootstrap inference, without adjusting for multiplicity. After the DAG comes a short summary list containing all the estimated effects for each outcome. The only information contained there that is not also present in the searchable tables is the set of TEs, each of which is simply the sum of the TIDE and DE for the given outcome.

Example conclusions can be framed as follows:



Model 1: \(\text{Ca present}\to\text{Saliva microbiome}\to\text{OH (surfaces)}\)

Model 1 DAG



The effect estimates for each outcome in model 1 are as follows:




Model 2: \(\text{Ca present}\to\text{Plaque microbiome}\to\text{OH (surfaces)}\)

Model 2 DAG



The effect estimates for each outcome in model 2 are as follows:




Model 3: \(\text{Ca present}\to\text{OH (all)}\to\text{Birth outcomes}\)

Model 3 DAG



The effect estimates for each outcome in model 3 are as follows:




Model 4: \(\text{Ca present}\to(\text{Saliva microbiome, OH (all)})\to\text{Birth outcomes}\)

Model 4 DAG



The effect estimates for each outcome in model 4 are as follows:




Model 5: \(\text{Ca present}\to(\text{Plaque microbiome, OH (all)})\to\text{Birth outcomes}\)

Model 5 DAG



The effect estimates for each outcome in model 5 are as follows:




Model 6: \(\text{Ca present}\to\text{Saliva microbiome}\to(\text{OH (surfaces), Birth})\)

Model 6 DAG



The effect estimates for each outcome in model 6 are as follows:




Model 7: \(\text{Ca present}\to\text{Plaque microbiome}\to(\text{OH (surfaces), Birth})\)

Model 7 DAG



The effect estimates for each outcome in model 7 are as follows:




Model 8: \(\text{Ca present}\to\text{Saliva microbiome}\to(\{\text{OH (surfaces)}\}, \{\text{Birth}\})\)

Model 8 DAG



The effect estimates for each outcome in model 8 are as follows:




Model 9: \(\text{Ca present}\to\text{Plaque microbiome}\to(\{\text{OH (surfaces)}\}, \{\text{Birth}\})\)

Model 9 DAG



The effect estimates for each outcome in model 9 are as follows: